# Video Description Generation

Tarsier 7b
Tarsier-7b is an open-source large-scale video-language model from the Tarsier series. It specializes in generating high-quality video descriptions and also shows strong general video understanding.
Video-to-Text Transformers
omni-research
635
23
Video Blip Flan T5 Xl Ego4d
MIT
VideoBLIP is an enhanced version of BLIP-2 that can process video data and uses Flan-T5-XL as its backbone language model.
Video-to-Text Transformers English
kpyu
40
3